Abstract — Feature extraction and selection are the primary
part of any mammogram classification algorithm. Statistical
texture features of mammogram images provide excellent
classification results in tumor identification. In this paper we
propose a Computer Aided Diagnosis (CAD) system which uses
second order statistical texture features derived from Gray
Level Co-occurrence Matrices (GLCM) along with the lazy
classifiers K*, IB1 and LWL for the detection and
classification of different types of abnormalities in
mammogram images. GLCM features measure the relationship
between each pixel and its neighboring pixels, in contrast to
the first order statistical features, namely the histogram and
intensity features. The classification performance achieved by
the GLCM features using different machine learning
algorithms is better than that obtained with first order
statistical features. Different types of GLCM features
extracted in different directions of the Region of Interest (ROI)
of mammogram images are put together as the feature vector
for the classification. This method is applied on three different
sizes of ROIs extracted from mammogram images in the
Mini-MIAS database. The results obtained on these three sets
of ROIs are promising.

Index Terms — GLCM, IBL, Kstar, Lazy classifier, LWL 

I. INTRODUCTION 

Breast cancer is one of the most threatening diseases found
among women all over the world. It is the second leading
cause of death in women, especially in the developed and
underdeveloped countries [1]. Breast cancer also occurs in
men, where it accounts for 1% of all breast cancers found in
the world [2] [3]. In India alone, breast cancer accounts for
23% of all female cancers, followed by cervical cancer at
17.5% [4]. No effective diagnostic method has been
suggested so far for this disease. The only way to decrease the
mortality rate of breast cancer is early detection [5]. The
commonly used diagnostic methods for breast cancer include
biopsy, mammography, thermography and ultrasound
imaging [6]. Mammography, which is a non-invasive method,
is considered the best approach among the diagnostic
methods suggested so far [7] [8] [9]. In spite of the
development of technology in the modern digital world, early
detection and recognition of doubtful abnormalities in digital
mammograms is a very difficult task [5] [10]. The primary
reason is that mammography provides relatively low contrast
images, especially in the case of dense or heavy breasts.
The symptoms of abnormal tissue may also remain quite
subtle [11]. The detection of tumors and the classification of
mammogram images are standard clinical practice for the
diagnosis of breast cancer. Soft computing methods such as
neural networks and fuzzy logic are now available for the
early detection of cancer cells in a human body, even before
physical symptoms appear [12]. Biopsy is the approach
normally used by the radiologist for identifying cancer cells
manually through a microscope. Most of the time, a biopsy
does not identify the exact tumor location in the specimen.
Therefore radiologists perform unnecessary biopsies, which
are time consuming and cause inconvenience to the patient.
As a measure to improve the diagnosis, researchers are
focusing on the development of computer aided detection
systems for identifying tumors or abnormalities in digital
mammograms. Once an abnormality is detected on the
mammogram image using the CAD system, the radiologist
can recommend a biopsy, which in turn reduces the need for
unnecessary biopsies. In addition, the CAD system can serve
as a second opinion for radiologists diagnosing the disease
[13] [14] [15]. In this paper we focus on the classification of
digital mammograms as normal or cancerous, which may
lead to the design of efficient CAD systems.

Different techniques have already been proposed to improve
the accuracy of breast tumor classification. New
developments must meet or exceed the high standards of
performance set by the existing algorithms. Common CAD
systems include image acquisition, enhancement of the
acquired image, segmentation or extraction of the regions of
interest, extraction of features from the regions of interest
and, finally, classification for identifying abnormalities.
Segmentation is an essential step of any CAD system since it
extracts the regions that have a high probability of containing
lesions. It also reduces the amount of data to process, so that
the performance of the CAD system can be improved.
Classification is the final step of the CAD system, which
identifies the normal and abnormal mammogram images in
the dataset [18] [19].

Mammography lesions such as microcalcifications and
masses are usually small and low in contrast compared to the
contiguous breast tissue. Therefore they are very hard to
detect. Image enhancement can improve the accuracy of the
diagnosis by the radiologists [18]. Various image
enhancement techniques such as thresholding, low and high
pass filters, contrast stretching, histogram modeling and
gradient operators are used for reducing noise, suppressing
background details and sharpening edges [20]. The usual task
of mammogram enhancement is to increase the contrast as
well as sharpen the edges or borders of the mammogram.
Once an enhanced mammogram image is obtained, the most
doubtful area where an abnormality occurs can be extracted
for further examination. This extracted portion is called the
Region of Interest (ROI) of the image. This process is called
segmentation, which usually corresponds to the extraction of
objects from the background. Segmentation is done in order
to locate the suspicious area of the mammogram. Wavelet
based segmentation, fractal models, fuzzy based approaches
and contour detection are some of the advanced segmentation
techniques used in mammogram image analysis. Multiple
ROIs of the same mammogram may be extracted for analysis
and classification. Texture information plays an important
role in the analysis and detection of breast tumors in
mammograms. Once a mammogram ROI is obtained, the
suspicious area where an abnormality occurs can be identified
by extracting important texture features from the image.
These features characterize tumors or abnormalities in the
images [16]. The extracted features are then analyzed using
different classifiers such as Artificial Neural Networks, hybrid
neural network classifiers, K-Nearest Neighbors and Support
Vector Machines. Fuzzy based approaches are also used for
classifying mammogram images based on the extracted
feature set [23].


Computer aided diagnosis of breast tumors is one of the
challenging tasks in the field of medical image processing.
A good number of works have already been published in this
area and most of them report good results. However, none of
these systems can be relied on completely, so there is scope
for further work in this area. The performance of a
classification system can be evaluated using parameters such
as sensitivity and specificity [2]. An ROI may be classified as
either cancerous (positive) or normal (negative). The final
decision falls into one of four possible categories: true
positive (TP), true negative (TN), false positive (FP) and
false negative (FN). FN and FP represent two kinds of
errors. An FN error implies that a true abnormality was not
detected, and an FP error occurs when a normal region is
falsely identified as abnormal. A TP decision is the correct
judgment of an existing abnormality and a TN decision
means that a normal region was correctly labeled [2] [17].
Therefore the performance of any CAD system is evaluated
based on the sensitivity, specificity and accuracy. They are
defined as follows:


\[ \text{Sensitivity} = \frac{TP}{TP + FN} \tag{1} \]

\[ \text{Specificity} = \frac{TN}{TN + FP} \tag{2} \]

\[ \text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{3} \]
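The three measures in (1) to (3) follow directly from the four decision counts. The sketch below is a minimal illustration in Python; the function name `evaluate` and the example counts are hypothetical and are not taken from the experiments reported in this paper. It assumes every denominator is non-zero.

```python
def evaluate(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Sensitivity, specificity and accuracy from the four decision counts."""
    return {
        "sensitivity": tp / (tp + fn),               # Eq. (1)
        "specificity": tn / (tn + fp),               # Eq. (2)
        "accuracy": (tp + tn) / (tp + fp + tn + fn), # Eq. (3)
    }

# Hypothetical counts, chosen only to illustrate the arithmetic.
scores = evaluate(tp=40, tn=45, fp=5, fn=10)
print(scores)
```

With these illustrative counts, 40 of the 50 abnormal ROIs and 45 of the 50 normal ROIs are labeled correctly, so the three values are 0.8, 0.9 and 0.85.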


In this paper we propose a new multilevel classification
scheme for classifying mammogram images. The feature
vector is formed from the Gray Level Co-occurrence Matrix
(GLCM) values extracted in four different orientations of
the ROIs. A set of lazy classifiers is then used for
classification. Initially the system classifies the mammogram
images as normal or abnormal. Then all the images classified
as abnormal in the first level are further classified into
appropriate categories depending upon the architectural as
well as texture patterns found in the image or ROI.

The rest of the paper is organized as follows: Section II
discusses the related work in this area. In Section III we
explain the construction of the GLCM and the features
extracted from it. The machine learning algorithms used for
the classification are discussed in Section IV. The proposed
method for feature extraction and classification is discussed
in Section V. The dataset used for the experiment and the
results are explained in Section VI, followed by the
conclusion.

II. RELATED WORK 

Feature extraction is the primary part of any
mammogram classification algorithm. Commonly, three
types of features, namely texture features, positional features
and shape features, are used for classification.
Texture features capture the alteration and variation of the
surface of the image, which can be characterized as the spatial
distribution of gray levels in a neighborhood. Positional
features describe the location-wise gray level distribution of
the image, and shape features extract the shape of an object in
the image based on the variation of the intensity distribution
of the gray levels.

There are two types of texture measures: first order and
second order. The first order texture measures are statistics
calculated from the individual pixel values of the image,
whereas the second order texture measures are statistics of a
pixel value with respect to its neighboring pixels. The
histogram features and intensity features are examples of
first order texture features. They are the simplest features
based on pixel intensities and are useful for the identification
of hidden patterns in a mammogram. The GLCM, a second
order texture feature, is extracted based on a group of pixels
in an ROI. Compared to the first order statistical features, the
GLCM features are more relevant because they capture
repeating patterns.

Different classification techniques are used for the
classification of masses in mammogram images. Linear
Discriminant Analysis (LDA), Artificial Neural Network
(ANN), Binary Decision Tree (BDT), Support Vector
Machines (SVM) and Bayesian Network (BN) are some of
the prominent classification methods. The performance of
these systems depends mainly on feature selection rather
than on the training and classification of the system. A
review of some of the prominent works in this area is given
below.

A comparative study by R. Nithya and B. Santhi [21] of the
above feature extraction methods shows excellent results with
GLCM features compared to other methods. The study used
a sample of 50 mammogram images from the DDSM
database. The same authors [22] proposed another method
for the classification of normal and abnormal patterns in
digital mammograms for breast cancer diagnosis using
GLCM features and ANN. The work reported sensitivity and
specificity of more than 90% for a sample set of 50 digital
mammogram images from the DDSM dataset.

A. Mohd Khuzi et al. [17] proposed a method for the
detection and classification of masses and non-masses in
mammogram images using GLCM features. They extracted
the features from ROIs that were segmented using different
segmentation algorithms, namely Otsu, thresholding and
K-means. The accuracy of the classification was measured on
a sample set consisting of 20 abnormal and 20 normal images
from Mini-MIAS. The work reported more than 80% for
both Otsu's and the thresholding techniques and 70% for
K-means.

A hybrid feature reduction method combining linear forward
selection and a genetic algorithm for reducing the GLCM
feature sets was proposed by Vasantha and Bharathi [25]
[26]. In this work 60 images from the DDSM database and
118 images from the Mini-MIAS database were used with a
decision tree classifier. They achieved 86% accuracy with
DDSM and 95% with the Mini-MIAS database.

Using ANN and GLCM features, Abdulla and Zaki [27]
proposed a method for the detection of masses in digital
mammograms and achieved 91% sensitivity and 84%
specificity in classifying 90 mammogram images randomly
selected from the Mini-MIAS database. Islam et al. [28] also
proposed a classification method using ANN and GLCM
features to separate the benign and malignant classes of
mammogram images, which achieved 90% sensitivity and
84% specificity.

III. GRAY LEVEL CO-OCCURRENCE MATRIX
(GLCM)

Feature extraction and the selection of suitable features from
the extracted set is a very important step in the development
of any CAD system for the detection and classification of
mammogram images. Features can be classified broadly
into statistical and semantic types. Both categories of features
have their own advantages and disadvantages for the
classification task. Features based on texture patterns are the
most prominent ones for the identification of masses or
tumors in an image. There are two types of statistical
texture features that can be extracted for classification: first
order statistical features and second order statistical features.
The GLCM is a second order statistical feature extracted
from an image based on neighboring pixels. The intensity
values of neighboring pixels form a group which represents a
certain repeating texture pattern in an image. This repeating
pattern is local to an image portion, so that it can be better
analyzed. The GLCM is a two dimensional array which takes
into account the specific position of a pixel relative to other
pixels [17]. The GLCM tabulates how often different
combinations of pixel brightness values occur in an image.
Each entry P(i, j) of a GLCM corresponds to the number of
occurrences of the pair of gray levels i and j which are a
distance d apart in the original image [29]. A single direction
might not give enough reliable texture information. Therefore
the GLCMs are constructed in different orientations θ, such
as 0°, 45°, 90° and 135°, at a distance of d. The fundamental
texture descriptors derived from GLCMs, namely the
contrast, energy, homogeneity and correlation of the gray
levels, are used as the features for the classification. The
contrast measures the amount of local variation present in an
image, while energy is the sum of squared elements in the
GLCM. Energy may also be referred to as uniformity or the
angular second moment. The homogeneity refers to the
closeness of the distribution of elements in the GLCM to the
GLCM diagonal. Finally, correlation shows how correlated a
pixel is to its neighbor over the whole image [17]. These
measures are mathematically defined as follows.


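\[ \text{Contrast} = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} (i - j)^2 \, P(i, j) \tag{4} \]

\[ \text{Energy} = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} P(i, j)^2 \tag{5} \]

\[ \text{Homogeneity} = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \frac{P(i, j)}{1 + (i - j)^2} \tag{6} \]

\[ \text{Correlation} = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \frac{(i - \mu_i)(j - \mu_j) \, P(i, j)}{\sigma_i \sigma_j} \tag{7} \]

Here N is the number of gray levels, P(i, j) is taken as the
GLCM entry normalized so that all entries sum to one, and
$\mu_i$, $\mu_j$, $\sigma_i$ and $\sigma_j$ are the means and
standard deviations of the row and column marginals of P.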
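To make the construction concrete, the following NumPy sketch builds a GLCM for a single pixel offset and evaluates the four descriptors. It is an illustration under stated assumptions rather than the implementation used in this work: the names `glcm` and `glcm_features` are hypothetical, the matrix is normalized to sum to one, pairs are counted in one direction only (non-symmetric), and homogeneity uses the 1 + (i - j)^2 denominator.

```python
import numpy as np

def glcm(image, dr, dc, levels):
    """Normalized co-occurrence matrix for pixel pairs offset by (dr, dc)."""
    counts = np.zeros((levels, levels), dtype=float)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r, c], image[r2, c2]] += 1
    return counts / counts.sum()  # assumption: entries normalized to sum to 1

def glcm_features(p):
    """Contrast, energy, homogeneity and correlation of a normalized GLCM."""
    i, j = np.indices(p.shape)
    contrast = np.sum((i - j) ** 2 * p)
    energy = np.sum(p ** 2)
    homogeneity = np.sum(p / (1.0 + (i - j) ** 2))
    mu_i, mu_j = np.sum(i * p), np.sum(j * p)
    sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * p))
    sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * p))
    # Undefined for a perfectly uniform region, where both deviations are zero.
    correlation = np.sum((i - mu_i) * (j - mu_j) * p) / (sd_i * sd_j)
    return contrast, energy, homogeneity, correlation

# Toy 4 x 4 image with four gray levels, used only for illustration.
toy = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
p = glcm(toy, dr=0, dc=1, levels=4)  # 0 degrees, d = 1
contrast, energy, homogeneity, correlation = glcm_features(p)
print(contrast, energy, homogeneity, correlation)
```

The toy image yields 12 horizontal pixel pairs, so each GLCM entry is a count divided by 12.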


IV. LAZY CLASSIFIERS

A classification problem occurs when an object needs to be
assigned to a predefined group or class based on a number of
observed attributes related to that object [32]. Different types
of classification algorithms are available today, among which
eager learning and instance based learning algorithms are the
most prominent. Lazy learning classifiers are instance based
or memory based classification algorithms proposed as an
alternative to the common eager learning algorithms. They
are an important category of classifiers that can be
implemented and tested easily at minimum cost. These
learning algorithms use some kind of distance measure
between test instances and training instances for the
classification. Entropy and distance measures are the two
common methods adopted by lazy classifiers.

The common eager learning methods eagerly compile the
training data into concept descriptors such as rule sets,
decision trees, artificial neural networks and graphical models.
In constructing such models, eager learning methods seek a
single general hypothesis that covers the entire instance
space. Lazy learning models, in contrast, do not build any
classification model before they encounter the unseen
instance to be classified. A lazy classifier constructs a model
only when it is directed to classify an unseen instance, and
discards the customized model and all intermediate results
once the learning process for that instance completes.
Therefore lazy learning algorithms need much lower training
cost, but more storage and computational resources at
classification time, than eager algorithms. Lazy learning
algorithms can make use of the characteristics of the unseen
instance to explore a richer hypothesis space during
classification. Due to this richer hypothesis space, lazy
learning methods can significantly outperform some of the
common eager learning methods.

Lazy learning exhibits many advantages in learning
scenarios. Common eager learning methods need to learn a
new global classifier every time the training data is updated.
When the training data is large and complex, it is not
economical for the service provider to conduct eager learning
frequently. Lazy learning methods have no such problem.
Generally, updating the training data is the only operation
required by lazy learning methods. Another learning scenario
in which lazy learning is competitive is one where the
learning target class is not fixed and the attribute set is large.
Lazy learning handles each classification as an independent
learning process, and hence it can be customized to the
unseen instance and focus only on the local data patterns
[33]. In this paper we use three different instance based
classifiers: the K*, IBL and LWL algorithms.

A. K* classifier 

K* is an instance based classifier that classifies a test
instance based on the classes of those training instances that
are similar to it, as determined by a similarity function. The
similarity function uses entropy as a distance measure. The
results obtained by this method compare favorably with
those of several other machine learning algorithms.

The similarity function computes the similarity between a test
instance and the instances in the concept descriptor built from
the training samples. Let x and y denote a test instance and an
instance in the concept descriptor respectively; the similarity
between these two instances is computed using the following
equation

\[ \text{Similarity}(x, y) = -\sqrt{\sum_{i=1}^{n} f(x_i, y_i)} \tag{8} \]

where the instances are described by n attributes. We define
$f(x_i, y_i) = (x_i - y_i)^2$ for numeric valued attributes and
$f(x_i, y_i) = (x_i \neq y_i)$ for Boolean and symbolic valued
attributes. Missing attribute values are assumed to be
maximally different from the value present. If both values
are missing, then $f(x_i, y_i)$ yields 1. The function
f(x, y) is the entropy computed using the concept
descriptors of the training samples using the equation

\[ E(S) = -\sum_{i} p_i \log p_i \tag{9} \]

where $p_i = |S_i| / |S|$, $|S_i|$ denotes the number of training
instances with class $C_i$, and $|S| = \sum_i |S_i|$ is the total
number of training instances.
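As a rough illustration of the two formulas in this subsection, the sketch below implements the attribute-wise similarity and the class entropy in plain Python. It is not the K* implementation used in the experiments. The helper names are hypothetical, numeric attributes are assumed to be scaled to [0, 1] so that a distance of 1 is "maximally different", missing values are represented by `None`, and the logarithm is taken to base 2, which the text does not specify.

```python
import math
from collections import Counter

def _is_number(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(v, (int, float)) and not isinstance(v, bool)

def attribute_distance(x, y):
    """f(x_i, y_i): squared difference for numbers, mismatch indicator otherwise."""
    if x is None or y is None:
        return 1.0  # a missing value is treated as maximally different
    if _is_number(x) and _is_number(y):
        return float(x - y) ** 2
    return float(x != y)

def similarity(x, y):
    """Negative Euclidean-style distance over all n attributes."""
    return -math.sqrt(sum(attribute_distance(a, b) for a, b in zip(x, y)))

def entropy(labels):
    """Entropy of the class distribution of the training labels (base 2 assumed)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

s = similarity((0.0, 1.0, "a"), (0.0, 0.0, "b"))
e = entropy(["normal", "normal", "benign", "malignant"])
print(s, e)
```

In the example, the two instances differ by 1 in one numeric attribute and in one symbolic attribute, so the similarity is the negative square root of 2. The label list has class proportions 1/2, 1/4 and 1/4, which gives an entropy of 1.5 bits.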

B. IBL Classifier 

Storing and using specific instances improves the
performance of several supervised learning algorithms.
The Instance-based Learning (IBL) algorithm generates
classification predictions using only specific instances. It
does not maintain a set of abstractions derived from specific
instances. This approach extends the nearest neighbor
algorithm, which has large storage requirements. A similarity
function is used for matching testing samples against
specific instances. By using these specific instances, the
instance-based learning algorithm reduces the cost incurred
for updating concept descriptors and increases the learning
rate. The instance-based learning algorithm is derived from
the nearest neighbor pattern classifier, but uses only selected
instances to generate classification predictions. Therefore the
instance-based learning algorithm reduces storage
requirements at the cost of a small degradation in
classification accuracy [34].


Each instance in the IBL classifier is represented by a set of
attribute-value pairs. This set of attributes defines an
n-dimensional instance space. Exactly one of these attributes
corresponds to the category attribute; the other attributes are
predictor attributes. A category is the set of all instances in an
instance space that have the same value for their category
attribute. However, IBL algorithms can learn multiple
overlapping concept descriptions simultaneously. The
concept description is a function that maps instances to
categories, and thus yields the classification. An
instance-based concept description includes a set of stored
instances and possibly some information concerning their
past performance during classification. This set of instances
can change after each training instance is processed.
However, IBL algorithms do not construct extensional
concept descriptions. Instead, concept descriptions are
determined by how the IBL algorithm's selected similarity
and classification functions use the current set of saved
instances. The classification function determines how the set
of saved instances in the concept description is used to
predict values for the category attribute.

The IBL framework used for defining the concept
description has the following components:

1. Similarity Function: This computes the similarity between
a testing instance i and the instances in the concept
description.

2. Classification Function: This receives the similarity
function's results and the classification performance records
of the instances in the concept description. It yields a
classification for the instance i.

3. Concept Description Updater: This maintains records on
classification performance and decides which instances to
include in the concept description. Inputs include i, the
similarity results, the classification results, and the current
concept description. It yields the modified concept
description.

The similarity and classification functions determine how the 
set of saved instances in the concept description are used to 
predict values for the category attribute. Therefore, IBL 
concept descriptions not only contain a set of instances, but 
also include these two functions. 
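The three components can be sketched as methods of a small class. This is a minimal IB1-style illustration, assuming numeric attributes, a negative Euclidean distance as the similarity function, and an updater that simply stores every training instance without keeping performance records. The class name `InstanceBasedLearner` and its method names are hypothetical.

```python
import math

class InstanceBasedLearner:
    """Minimal IB1-style learner: the concept description is the stored instances."""

    def __init__(self):
        self.concept_description = []  # list of (attribute tuple, category) pairs

    def similarity(self, x, y):
        # Component 1: similarity between a test instance and a stored instance.
        return -math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

    def classify(self, x):
        # Component 2: return the category of the most similar stored instance.
        _, category = max(self.concept_description,
                          key=lambda inst: self.similarity(x, inst[0]))
        return category

    def update(self, x, category):
        # Component 3: simplest updater, which keeps every instance it is shown.
        self.concept_description.append((tuple(x), category))

learner = InstanceBasedLearner()
for attributes, category in [((0.0, 0.0), "normal"), ((1.0, 1.0), "abnormal")]:
    learner.update(attributes, category)

print(learner.classify((0.1, 0.2)))  # nearest stored instance is (0.0, 0.0)
print(learner.classify((0.9, 0.8)))  # nearest stored instance is (1.0, 1.0)
```

Variants that store only selected instances would change the `update` method alone, leaving the similarity and classification functions untouched.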

IBL algorithms differ from most other supervised learning
methods: they do not construct explicit abstractions such as
decision trees or rules. Most learning algorithms derive
generalizations from instances when they are presented, and
then use simple matching procedures to classify subsequently
presented instances. This incorporates the purpose of the
generalizations at presentation time. IBL algorithms
perform comparatively little work at presentation time
since they do not store explicit generalizations. However,
their work load is higher when presented with subsequent
instances for classification, at which time they compute the
similarities of their saved instances with the newly presented
instance. This obviates the need for IBL algorithms to store
rigid generalizations in concept descriptions, which can
require large updating costs to account for prediction
errors [35].

C. LWL classifier

Lazy learning methods defer processing of training data until 
a query needs to be answered. This usually involves storing 
the training data in memory, and finding relevant data in the 
database to answer a particular query. Relevance is often 
measured using a distance function, with nearby points 
having high relevance. One form of lazy learning finds a set 
of nearest neighbors and selects or votes on the predictions 
made by each of the stored points [33].

Locally Weighted Learning (LWL) is a lazy classifier that uses
statistical learning techniques for training and classifying
complex tasks. It provides an approach to learning models of
complex phenomena, dealing with large amounts of data,
training quickly, and avoiding interference between multiple
tasks during control of complex systems. LWL methods can
even deal successfully with high dimensional input data that
have redundant and irrelevant inputs while keeping the
computational complexity of the algorithms linear in the
number of inputs [33] [34]. LWL methods come in two
different strategies. Memory-based LWL is a "lazy learning"
method that simply stores all training data in memory and
uses efficient lookup and interpolation techniques when a
prediction for a new input has to be generated [33] [34]. This
kind of LWL is useful when data needs to be interpreted in
flexible ways, for instance as a forward or inverse
transformation. Memory-based LWL is therefore a "least
commitment" approach and very data efficient.
Non-memory-based LWL has essentially the same statistical
properties as memory-based LWL, but it avoids storing data
in memory by using recursive system identification
techniques [37]. In this way, non-memory-based LWL caches
the information about the training data in compact
representations, at the cost that a flexible re-evaluation of
the data becomes impossible, but lookup times for new data
become significantly faster.
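The memory-based strategy can be illustrated with a locally weighted linear regression: all training data is kept, and a small weighted model is fitted only when a query arrives. The sketch below is an assumption-laden example, not the LWL classifier used in the experiments. The function name `lwl_predict`, the Gaussian kernel and the `bandwidth` parameter are choices made here for illustration.

```python
import numpy as np

def lwl_predict(X, y, query, bandwidth=1.0):
    """Fit a weighted linear model around `query` and return its prediction."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    query = np.asarray(query, dtype=float)

    # Relevance of each stored point: nearby points receive larger weights.
    sq_dist = np.sum((X - query) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))

    # Weighted least squares via row scaling by the square root of the weights.
    design = np.hstack([np.ones((len(X), 1)), X])  # intercept column + inputs
    root_w = np.sqrt(weights)
    beta, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)

    # The local model is discarded after this single prediction.
    return float(np.concatenate(([1.0], query)) @ beta)

# Toy data lying exactly on y = 2x + 1, used only to check the arithmetic.
X_train = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y_train = [1.0, 3.0, 5.0, 7.0, 9.0]
prediction = lwl_predict(X_train, y_train, [2.5])
print(prediction)
```

Because the toy data is exactly linear, the local fit recovers the line and the prediction at 2.5 is 6 up to floating point error. On real data the bandwidth controls how local the fitted model is.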

V. PROPOSED METHOD 

The proposed method presents a novel approach to a
computer aided diagnosis (CAD) system for the detection of
abnormalities in breast tumors. It consists of two levels of
classification. In the first level, a method is devised for
identifying and classifying the risk level of the breast
mammograms, i.e. normal, benign or malignant. In the second
level, all the abnormal images in the dataset are further
classified based on the type of abnormality or distortion, such
as calcification, asymmetric distortion, architectural
distortion, circumscribed masses, spiculated masses and
ill-defined masses. The architecture of the proposed system
is given in Fig 1.

[Flowchart: Mammogram Acquisition -> ROI Extraction ->
Construction of GLCMs -> Feature Vector Generation ->
First Level Classification (Normal | Abnormal) ->
Second Level Classification of abnormal ROIs
(CALC | ARCH | ASYM | CIRC | MISC | SPIC)]

Fig 1: Architecture of the proposed system


For classification, the GLCM features discussed in Section III
are extracted from the ROIs of the dataset. The GLCMs
are generated in four different orientations for three different
sizes of ROIs (8 x 8, 16 x 16 and 32 x 32 pixels). The
GLCMs are constructed by taking pairs of image cells at a
distance d = 1 apart and incrementing the matrix position
corresponding to the gray levels of both cells. Thus the system
generates four GLCMs, one for each of the orientations
0°, 45°, 90° and 135°. Then the system extracts the features
contrast (C), energy (E), homogeneity (G) and
correlation (R) of the gray level values in the GLCM
of each ROI. The four features extracted from each of the
four orientations are combined to form a feature vector,
which comprises a set of 16 features.
This feature vector acts as the basis for the classification.
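The assembly of the 16-element vector can be sketched as follows. The pixel offsets chosen for the four orientations at d = 1, the 256 gray levels, and the names `OFFSETS`, `glcm`, `texture_features` and `feature_vector` are assumptions made for this illustration, and the random ROI is a stand-in for real image data.

```python
import numpy as np

# Assumed (row, column) offsets for d = 1; rows grow downwards in the array.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(roi, offset, levels):
    """Normalized GLCM of an integer-valued ROI for one (row, column) offset."""
    dr, dc = offset
    rows, cols = roi.shape
    # Keep only reference pixels whose neighbor also lies inside the ROI.
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    ref = roi[r0:r1, c0:c1]
    nbr = roi[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
    counts = np.zeros((levels, levels))
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1)
    return counts / counts.sum()

def texture_features(p):
    """Contrast, energy, homogeneity, correlation (in that order)."""
    i, j = np.indices(p.shape)
    mu_i, mu_j = np.sum(i * p), np.sum(j * p)
    sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * p))
    sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * p))
    return [np.sum((i - j) ** 2 * p),
            np.sum(p ** 2),
            np.sum(p / (1.0 + (i - j) ** 2)),
            np.sum((i - mu_i) * (j - mu_j) * p) / (sd_i * sd_j)]

def feature_vector(roi, levels=256):
    """Concatenate the four features over the four orientations (16 values)."""
    features = []
    for angle in (0, 45, 90, 135):
        features.extend(texture_features(glcm(roi, OFFSETS[angle], levels)))
    return np.array(features)

rng = np.random.default_rng(0)
roi = rng.integers(0, 256, size=(32, 32))  # synthetic stand-in for a 32 x 32 ROI
vec = feature_vector(roi)
print(vec.shape)
```

The ordering within the vector (all four features for 0°, then 45°, and so on) is arbitrary, but it must be the same for every ROI so that each column of the dataset has a fixed meaning.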

The classifier is trained using the feature vectors extracted
from the different sets of ROIs of size 8 x 8, 16 x 16 and
32 x 32 from the images in the Mini-MIAS database. The
common lazy learning algorithms K*, IB1 and LWL are used
for training and testing on the ROIs. The training and testing
datasets are prepared by dividing the entire dataset of feature
vectors into ten folds of equal size. Nine folds of the dataset
are used for training and the remaining fold is used for
testing. The training and testing process is repeated so that
each fold serves as the test set once, and the performance is
evaluated by taking the average of the test results obtained
over the ten folds.
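The ten-fold procedure can be sketched in a few lines of NumPy. A plain one-nearest-neighbor rule stands in for the lazy classifiers here, the folds are formed by a random shuffle without stratification, and the data is synthetic. The function names and the `seed` parameter are hypothetical choices for this illustration.

```python
import numpy as np

def nearest_neighbour_predict(train_X, train_y, test_X):
    """Label each test row with the class of its closest training row."""
    dists = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    return train_y[np.argmin(dists, axis=1)]

def cross_validate(X, y, n_folds=10, seed=0):
    """Mean test accuracy over n_folds train/test splits."""
    order = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(order, n_folds)
    accuracies = []
    for k in range(n_folds):
        test_idx = folds[k]  # one fold is held out for testing
        train_idx = np.concatenate([folds[m] for m in range(n_folds) if m != k])
        predicted = nearest_neighbour_predict(X[train_idx], y[train_idx], X[test_idx])
        accuracies.append(np.mean(predicted == y[test_idx]))
    return float(np.mean(accuracies))

# Synthetic, well separated two-class data standing in for 16-feature vectors.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (50, 16)), rng.normal(5.0, 0.1, (50, 16))])
y = np.array([0] * 50 + [1] * 50)
mean_accuracy = cross_validate(X, y)
print(mean_accuracy)
```

Because the two synthetic classes are far apart, every held-out instance is classified correctly and the mean accuracy is 1.0. Real GLCM features would of course give a lower figure.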

Algorithm for Mammogram Classification using Lazy classifiers

1: Extract mammogram ROIs of sizes 32 x 32, 16 x 16 and
8 x 8 pixels, based on the abnormality center, from the
original mammogram images of size 1024 x 1024 pixels in
the Mini-MIAS database.

2: From the extracted ROIs of each size (32 x 32, 16 x 16
and 8 x 8), construct the gray level co-occurrence
matrices in four different orientations (0°, 45°, 90° and
135°) at unit distance.

3: Compute the features contrast, energy, homogeneity and
correlation for each GLCM constructed in step 2.

4: Form a feature vector of 16 features comprising the
features computed in step 3 for all four GLCMs constructed
for a mammogram ROI.

5: Group the feature vectors computed in step 4 into training
and testing sets for classification.

6: Train the Weka lazy classifiers K*, IBL and LWL on the
training set.

7: Evaluate the performance of the lazy classifiers on the
testing set.

VI. DATASETS AND RESULTS

A. Dataset 

Mammogram images are low intensity gray scale images
which show the details of the patient's breast in terms of
pixel values or intensity distribution. The details could
be normal tissues, vessels, muscles, different types of masses
and noise. Each type of mass has different properties of
shape, size and brightness, which are described in terms of the
intensity distribution of the image. Generally these properties
are measured as different features of the mammogram images
in the dataset. The radiologist makes use of these features for
the effective diagnosis of a breast tumor if one is identified.

In this study we used a set of mammogram images provided
by the Mammographic Image Analysis Society (Mini-MIAS)
[24]. The database contains left and right breast images
investigated and labeled by an expert radiologist. From this
dataset, regions of interest (ROIs) of different sizes (8 x 8,
16 x 16 and 32 x 32 pixels) are extracted for our
investigation. For cancerous images, the ROIs are extracted
from the original mammogram images based on the
abnormality center already marked by the radiologist. For
non-cancerous images, the ROIs are extracted with respect to
the center of the original mammogram image. For practical
evaluation of the proposed system, the entire dataset, which
comprises 322 ROIs of different types of lesions as shown in
Table I, is used. While extracting ROIs from cancerous
images, multiple abnormal regions may be extracted; they are
treated as separate ROIs of the same image and are also
included in the dataset for the classification.
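The ROI extraction step can be sketched as a square crop around a given center. The function name `extract_roi` is hypothetical, the blank array is a placeholder for a real mammogram, and shifting the window to keep it inside the image near the border is an assumption of this sketch. The center is expected in array (row, column) coordinates, so annotations stored in another coordinate convention would need to be converted first.

```python
import numpy as np

def extract_roi(image, center_row, center_col, size):
    """Crop a size x size window around the given center, kept inside the image."""
    half = size // 2
    # Shift the window inwards when the center is too close to a border.
    top = min(max(center_row - half, 0), image.shape[0] - size)
    left = min(max(center_col - half, 0), image.shape[1] - size)
    return image[top:top + size, left:left + size]

# Placeholder standing in for a 1024 x 1024 gray scale mammogram.
mammogram = np.zeros((1024, 1024), dtype=np.uint8)

# Normal image: crop around the image center, for each of the three ROI sizes.
rois = {size: extract_roi(mammogram, 512, 512, size) for size in (8, 16, 32)}
print({size: roi.shape for size, roi in rois.items()})
```

For an abnormal image, the same function would be called once per marked abnormality center, which yields several ROIs for an image with multiple abnormal regions.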

VII. Results 

The dataset used for the experiment comprises 330 ROIs 
extracted from 322 mammogram images of the Mini-MIAS 
database. The set consists of 207 normal, 54 malignant and 
69 benign images. ROIs of different sizes (8 x 8, 16 x 16 
and 32 x 32 pixels) are extracted from each mammogram image 
in the dataset based on the abnormality center of the image. 
We then formed three different sets of GLCM feature vectors, 
one for each ROI size, and classified them using three 
different lazy classifiers, namely K*, IB1 and LWL, available 
in the Weka software. The classification is done in two 
levels. In the first level, the proposed algorithm identifies 
the risk level of the images in the dataset as Normal, Benign 
or Malignant. The confusion matrices generated by the 
different classifiers for the first-level classification are 
shown in Table II (see appendix). 
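
As a minimal sketch of how a GLCM feature vector of the kind mentioned above could be formed, the code below counts grey-level co-occurrences for four directions at distance 1 and derives two common texture measures from each matrix. The function names, the number of grey levels, the non-symmetric counting and the choice of contrast and angular second moment as the features are assumptions for illustration; this section does not specify the exact feature set used.

```python
def glcm(roi, d_row, d_col, levels):
    """Normalised grey-level co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(roi), len(roi[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[roi[r][c]][roi[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Sum of (i - j)^2 * p(i, j): large when neighbours differ strongly."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p) for j, v in enumerate(row))


def angular_second_moment(p):
    """Sum of p(i, j)^2: large when few grey-level pairs dominate."""
    return sum(v * v for row in p for v in row)


# Offsets (row, column) for 0, 45, 90 and 135 degrees at distance 1.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]


def feature_vector(roi, levels):
    """Concatenate the two measures over all four directions."""
    features = []
    for d_row, d_col in OFFSETS:
        p = glcm(roi, d_row, d_col, levels)
        features += [contrast(p), angular_second_moment(p)]
    return features


# Small made-up ROI with four grey levels (0 to 3).
roi = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
p_horizontal = glcm(roi, 0, 1, levels=4)
vector = feature_vector(roi, levels=4)
```

A real ROI would first be quantised to a fixed number of grey levels so that `levels` stays small enough for the matrix to be well populated.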

Based on the confusion matrices generated by the classifiers, 
the performance of the different classifiers with varying ROI 
sizes is evaluated. The evaluation result is shown in Table 
III, from which we arrive at the following conclusions 
regarding our algorithm. The classification accuracy obtained 
for K* is the best, followed by IB1 and LWL. The accuracy of 
K* and IB1 increases significantly with the ROI size, from 
about 73% for 8 x 8 ROIs to about 92% for 32 x 32 ROIs, 
whereas the accuracy of LWL stays near 64% for every ROI size 
and is poor compared to the other two classifiers. The 
performance of our algorithm using the three lazy classifiers 
is also shown in Fig 2. 
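
The accuracy values follow directly from the confusion matrices: the correctly classified ROIs lie on the diagonal, and accuracy is their share of all ROIs. The short sketch below (the function name `accuracy` is ours) uses the K* matrix for 8 x 8 ROIs from Table II, with rows taken as the actual class and columns as the predicted class, which is consistent with the row sums of 207, 54 and 69.

```python
def accuracy(confusion):
    """Percentage of samples on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return 100.0 * correct / total


# K* on 8 x 8 ROIs (Table II); class order is Normal, Malignant, Benign.
kstar_8x8 = [[206, 0, 1],
             [33, 21, 0],
             [54, 0, 15]]

print(f"{accuracy(kstar_8x8):.2f}")  # 73.33, the K* entry for 8 x 8 in Table III
```

Here 206 + 21 + 15 = 242 of the 330 ROIs are classified correctly, which gives 73.33%.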


Table I: Lesion distribution of MIAS database 

LESION                             RISK         # 
Normal                                          207 
Architectural distortion [ARCH]    Benign       09 
                                   Malignant    10 
Asymmetry [ASYM]                   Benign       06 
                                   Malignant    06 
Microcalcification [CALC]          Benign       12 
                                   Malignant    13 
Circumscribed masses [CIRC]        Benign       19 
                                   Malignant    04 
Ill-defined masses [MISC]          Benign       06 
                                   Malignant    08 
Spiculated lesions [SPIC]          Benign       11 
                                   Malignant    08 
Total                                           322 


Table III: Classification accuracy (in %) of mammogram 
images using different Lazy classifiers. 

ROI Size    K*       IB1      LWL 
8 x 8       73.33    72.73    63.94 
16 x 16     83.33    83.03    63.64 
32 x 32     92.40    92.10    63.83 



Fig 2: Performance evaluation of the different lazy classifiers 
in the first (primary risk) level classification. 


In the second level of the classification, the classifiers are 
trained to classify all the abnormal images in the dataset into 
different classes, namely microcalcification, architectural 
distortion, asymmetry, circumscribed masses, ill-defined 
masses and spiculated lesions. The confusion matrices 
generated by the three lazy classifiers in the second stage of 
the classification are shown in Table IV (see appendix). Based 
on these confusion matrices, the classification accuracy 
obtained by the three lazy classifiers is shown in Table V. 
The table reveals that the algorithm performs well for K* and 
IB1 with an ROI size of 32 x 32 pixels (86.18% for both). As 
in the first-level classification, the performance of the 
sub-classification with K* and IB1 improves significantly as 
the ROI size increases. Finally, the performance of the LWL 
classifier is very poor compared to the other two classifiers, 
irrespective of the ROI size. Graphical representations of the 
performance of the classification algorithms are shown in 
Fig 3. 
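
Overall accuracy hides how the individual lesion classes behave. As a hedged illustration, the sketch below computes the per-class recall (the share of each actual class that is labelled correctly) from the K* confusion matrix for 32 x 32 ROIs in Table IV, taking rows as the actual class. The function name `per_class_recall` is ours, and per-class recall is not a measure reported in this paper.

```python
def per_class_recall(confusion, labels):
    """Percentage of each actual class (row) that lies on the diagonal."""
    return {label: 100.0 * row[i] / sum(row)
            for i, (label, row) in enumerate(zip(labels, confusion))}


labels = ["CALC", "CIRC", "ARCH", "ASYM", "MISC", "SPIC"]

# K* on 32 x 32 ROIs (Table IV); rows = actual class, columns = predicted.
kstar_32x32 = [[30, 0, 0, 0, 0, 0],
               [6, 13, 0, 0, 0, 0],
               [2, 0, 13, 0, 0, 0],
               [4, 0, 0, 21, 0, 0],
               [3, 0, 0, 0, 12, 0],
               [2, 0, 0, 0, 0, 17]]

recall = per_class_recall(kstar_32x32, labels)
for label in labels:
    print(f"{label}: {recall[label]:.2f}")
```

In this matrix every misclassified ROI falls in the first column, that is, it is labelled as CALC, so CALC reaches 100% recall while CIRC is the weakest class at 13 of 19 ROIs.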


Table V: Classification accuracy (in %) of mammogram 
images using different Lazy Classifiers 

ROI Size    K*       IB1      LWL 
8 x 8       50.41    50.41    33.33 
16 x 16     67.48    65.04    38.21 
32 x 32     86.18    86.18    37.40 

Fig 3: Performance evaluation of the different Lazy 
classifiers in sublevel risk. 


VIII. CONCLUSION 

Second-order statistical texture features play a significant 
role in the classification stage of a CAD system. In this 
paper we proposed an automatic classification system that 
classifies mammogram images in two stages. In the first stage, 
the system classifies the images in the dataset into normal, 
malignant and benign types. In the second stage, all the 
abnormal images in the dataset are further classified into 
different sub-categories of abnormalities. The feature vectors 
used for the classification are generated from the GLCMs 
constructed in different orientations of the ROIs of the 
mammogram images. Finally, classification is done using three 
lazy classifiers: K*, IB1 and LWL. The performance of the 
system is measured using the accuracy obtained by the 
different classification algorithms. It is observed that the 
proposed system with second-order statistical features 
performs best with K* on 32 x 32 ROIs, reaching an accuracy of 
92.40% in the first stage and 86.18% in the second stage. 

Table II: Confusion matrix generated by different lazy classifiers on Mini-MIAS Database 

         8 x 8            16 x 16           32 x 32 
        N    M    B       N    M    B       N    M    B 
K* 
  N   206    0    1     206    1    0     206    0    1 
  M    33   21    0      16   38    0      10   42    1 
  B    54    0   15      37    1   31      13    0   56 
  T   293   21   16     259   40   31     229   42   58 
IB1 
  N   207    0    0     207    0    0     206    0    1 
  M    33   21    0      18   36    0      10   42    1 
  B    57    0   12      37    1   31      13    0   56 
  T   297   21   12     262   37   31     229   42   58 
LWL 
  N   207    0    0     205    0    2     207    0    0 
  M    50    4    0      52    2    0      51    2    0 
  B    69    0    0      65    1    3      67    1    1 
  T   326    4    0     322    3    5     325    3    1 

N: Normal M: Malignant B: Benign 


Table IV: Confusion matrix generated by different lazy classifiers on Mini-MIAS Database 

        8 x 8 Pixel Size          16 x 16 Pixel Size         32 x 32 Pixel Size 
       1   2   3   4   5   6      1   2   3   4   5   6      1   2   3   4   5   6 
K* 
  1   30   0   0   0   0   0     17   0   0  12   1   0     30   0   0   0   0   0 
  2   11   7   0   1   0   0      0   8   0  10   1   0      6  13   0   0   0   0 
  3    9   0   5   0   0   1      0   0  10   5   0   0      2   0  13   0   0   0 
  4   18   0   0   7   0   0      0   0   0  25   0   0      4   0   0  21   0   0 
  5    8   0   0   0   6   1      0   0   0   3  12   0      3   0   0   0  12   0 
  6   12   0   0   0   0   7      0   0   0   8   0  11      2   0   0   0   0  17 
  T   88   7   5   8   6   9     17   8  10  63  14  11     47  13  13  21  12  17 
IB1 
  1   30   0   0   0   0   0     30   0   0   0   0   0     30   0   0   0   0   0 
  2   11   8   0   0   0   0     11   8   0   0   0   0      6  13   0   0   0   0 
  3    9   0   6   0   0   0      5   0  10   0   0   0      2   0  13   0   0   0 
  4   18   2   0   5   0   0     15   0   0  10   0   0      4   0   0  21   0   0 
  5    8   0   1   0   6   0      4   0   0   0  11   0      3   0   0   0  12   0 
  6   12   0   1   0   0   6      8   0   0   0   0  11      2   0   0   0   0  17 
  T   88  10   8   5   6   6     73   8  10  10  11  11     47  13  13  21  12  17 
LWL 
  1   30   0   0   0   0   0     14   0   0  14   2   0     28   0   0   0   2   0 
  2   15   0   0   0   4   0      5   3   1  10   2   0     17   0   0   1   1   0 
  3   11   0   1   0   3   0      5   0   2   6   2   0      7   0   0   2   6   0 
  4   20   0   0   1   4   0      3   0   0  22   0   0     15   0   0  10   0   0 
  5    8   0   0   0   7   0      5   0   0   4   6   0      4   0   0   4   7   0 
  6   13   0   1   0   3   2     10   0   0   9   0   0     14   0   0   3   1   1 
  T   97   0   2   1  21   2     42   3   3  65  12   0     85   0   0  20  17   1 

1: CALC 2: CIRC 3: ARCH 4: ASYM 5: MISC 6: SPIC 